Abstract

In this paper we discuss studies we have been conducting on human-robot interaction (HRI) during the
Urban Search and Rescue (USAR) competitions in the NIST Reference Test Arena. We discuss some of
the analyses we have already done on the data we have collected and present the guidelines we have
produced based on these studies. We discuss future plans for augmenting USAR competitions to
specifically compare different methods of HRI.


Introduction 

The ultimate evaluation of how humans and robots interact is the measure of their combined performance.
In search and rescue the human-robot team has two goals: to locate victims and to provide accurate
information about their location and their alertness state to human rescuers. These goals need to be
achieved under a number of constraints. Teams need to operate for extended periods of time; the number
of personnel used in the operation should be limited due to the dangerous nature of the operation; and the
tasks need to be accomplished quickly to maximize the lives that can be saved [2]. Many human-robot
search and rescue teams have participated in Urban Search and Rescue (USAR) competitions in the NIST
test arenas [7,8]. The overall scoring for these competitions emphasizes these goals and constraints.
Although  scoring  varies  from  year  to  year,  the  teams  are  rewarded  for  locating  victims  in  a  timely  fashion, 
accurately  assessing  their  condition,  and  providing  good  maps  for  rescue  workers.  Teams  are  penalized  for 
causing  further  damage  to  the  collapsed  structure.  Teams  requiring  multiple  operators  for  individual  robots 
are  also  penalized. 

Good  human-robot  interaction  (HRI)  contributes  heavily  to  a  team’s  overall  score.  However,  there  are  a 
number  of  other  contributing  factors  as  well,  including  the  mobility  of  the  robot,  the  skill  of  the  operator, 
the  robustness  of  the  hardware,  software,  and  communications,  and  the  sensory  perception  provided.  We 
are  interested  in  evaluating  the  various  user  interfaces  to  determine  what  information  and  information 
presentation  contributes  to  the  overall  performance  of  the  system. 


Pros and Cons of Using the USAR Competitions for HRI Evaluation

The  primary  benefit  of  using  these  competitions  for  studying  HRI  evaluation  is  that  the  competitors  provide 
many  more  ideas  for  user  interfaces  than  we,  as  researchers,  could  possibly  prototype  and  test. 

The  limitations  are  that  we  can  only  study  the  operator  role  [11].  The  operators  in  the  competitions  are 
expert  users,  i.e.,  robotics  researchers.  We  are  not  allowed  to  interfere  with  the  competition  environment 
which  means  that  we  cannot  collect  think-aloud  or  talk-aloud  protocols  [5,6]  from  the  operators.  It  is 
difficult  to  interview  the  operators  after  their  runs  as  they  are  busy  getting  ready  for  their  next  round. 
Moreover, the teams come from all over the world and there are language barriers to overcome. The user
interfaces  and  the  robots  are  dynamic.  The  teams  make  changes  during  the  competitions.  Different  robots 
are  used;  different  sensors  are  used;  different  teammates  take  turns  at  being  the  operator. 
In spite of the limitations, these competitions provide a rich source of data in a reasonable USAR
simulation.


Data  Collection 

We  have  collected  data  at  six  major  competitions  starting  in  2002.  We  collect  video  data  of  the  user 
interface,  the  operator,  and  the  robot  as  it  moves  through  the  arena.  In  addition  we  collect  information 
about  the  robot’s  path  and  coverage  of  the  arenas.  We  also  have  access  to  the  overall  performance  scores 
including penalties incurred.

We  typically  tap  into  the  video  output  of  the  operator  control  unit  (OCU)  and  direct  this  to  a  scan  converter 
which  sends  the  converted  output  to  a  video  recorder  for  later  analysis.  As  the  setup  time  for  teams  to  get 
ready  for  their  rounds  is  between  10  and  15  minutes,  data  collection  setup  has  to  be  quick  and  flawless. 
Prior  to  the  initial  rounds,  we  test  out  the  data  collection  equipment  with  each  team  who  agrees  to 
participate  in  our  study.  We  make  sure  that  all  the  video  is  time  stamped  so  that  we  can  easily  move 
between  the  operator  view  of  the  user  interface  and  ground  truth  as  represented  by  the  robot  moving  in  the 
arena  during  analysis.  It  is  difficult  to  tape  the  movement  of  the  robot  in  the  arena,  as  portions  of  the  arenas 
are  covered.  Debris  and  multiple  levels  in  the  arenas  make  it  difficult  to  see  the  robot  at  all  times  without 
being  physically  in  the  arena.  We  try  to  capture  data  from  outside  of  the  arena  as  our  presence  can  cause 
the  sensors  on  the  robot  to  mistakenly  identify  us  as  victims  or  unintentionally  point  out  possible  paths 
through  the  debris.  Figure  1  shows  three  different  sections  of  the  Robocup  2004  NIST  test  arenas. 


Figure 1: NIST Test Arenas at the Robocup USAR 2004 Competition
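The time stamping described above is what lets an analyst jump between the operator's view and the arena view of the same moment. The following is a minimal sketch of that bookkeeping, assuming each recording has a known wall-clock start time. The function name and the example start times are hypothetical and do not come from the study.

```python
from datetime import datetime, timedelta


def to_other_recording(position_s, source_start, target_start):
    """Map a playback position (seconds) in one recording to the same instant in another.

    Both start times are wall-clock time stamps of the first frame of each recording.
    A negative result means the target recording had not started yet.
    """
    instant = source_start + timedelta(seconds=position_s)
    return (instant - target_start).total_seconds()


# Hypothetical start times: the arena camera started 12 seconds after the OCU capture.
ocu_start = datetime(2004, 7, 1, 10, 0, 0)
arena_start = datetime(2004, 7, 1, 10, 0, 12)

# An event seen 90 seconds into the OCU recording, located in the arena recording.
arena_position = to_other_recording(90, ocu_start, arena_start)
print(arena_position)
```

With a shared clock, a single offset per recording is enough. If the recorders' clocks drift, the offset would have to be re-estimated from events visible in both views.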


Analysis  of  Data 

We have completed analysis of two sets of data to date. Our initial data analysis was
completed on data collected at the 2002 USAR competition at the American Association for Artificial
Intelligence (AAAI) conference [14]. We collected data from all the teams in the competition but coded only data from the
four top-ranking teams. We used the data from the semifinals and finals. We were interested in
how the overall performance correlated with a finer analysis of performance. We reviewed the video tapes
and coded the amount of time each team spent in navigation or monitoring navigation, in identifying
victims, in logistics, and in failures. Table 1 contains the definitions of these terms.


Table  1:  Definitions  of  Coded  Activities 


Activity Coded                        Definition

Navigation or monitoring navigation   This activity was coded when operators were teleoperating a
                                      robot, or in the case of semi-autonomous robots, when the
                                      operator was issuing navigation commands and watching the
                                      user interface to assess how the robot was moving.

Victim identification                 We coded this activity when the operator thought he had
                                      sensed a victim and moved closer or used other sensors to
                                      assess the status of the victim.

Logistics                             Activities such as starting up another robot were coded as
                                      logistics.

Failures                              Hardware, software, and communications dropouts were coded
                                      as failures.

Table 2 shows the percentage of time the four teams spent in these activities. Note that we were only able
to code two of the three runs due to issues with the data collection mechanism. The total time is given in
minutes and seconds. Each team was allocated 15 minutes for each run. It was difficult to coordinate with the
competition officials to know the actual start and stop times and, in one case, we lost some time due to a
data collection issue. Note that the percentages do not always add up to 100%. This is due to
rounding errors in calculating times.
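The rounding effect mentioned above can be illustrated with a minimal sketch. It assumes the coded time for each activity is available in seconds and that each share is rounded to a whole percent. The function names and the example durations are hypothetical and are not data from the study.

```python
def parse_mmss(text):
    """Convert a 'minutes:seconds' string such as '10:39' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_time(durations):
    """Return each activity's share of the total coded time, rounded to a whole percent."""
    total = sum(durations.values())
    if total == 0:
        raise ValueError("no coded time")
    return {activity: round(100 * secs / total) for activity, secs in durations.items()}


# Hypothetical coded durations for one 15-minute run (not taken from Table 2).
coded = {
    "navigation": parse_mmss("5:00"),
    "victim identification": parse_mmss("5:00"),
    "failure": parse_mmss("5:00"),
    "logistics": 0,
}

shares = percent_time(coded)
print(shares)                # three equal shares of 33 percent and one of 0
print(sum(shares.values()))  # 99, not 100, because each share is rounded separately
```

Rounding each share independently is the simplest choice and matches the behavior described in the text. A largest-remainder scheme would force the total to 100 at the cost of adjusting individual entries.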

Table 2: How teams spent their time

                                  % Time
                   Total time     Navigation/
Team      Run      (min:sec)      monitoring navigation   Victim ID   Failure   Logistics

Team A    1        10:39          46                      51          0         3
          3        14:45          62                      18          19        1

Team B    1        14:33          81                      19          0         0
          3        16:42          77                      23          0         0

Team C    1        13:26          59                      23          17        0
          3        14:39          69                      12          18        0

Team D    1        15:12          55                      32          0         12
          3        13:30          87                      4           0

We found that teams using some sort of automatic mapping were more successful in navigating the arenas.
Operators who had to keep maps in their heads became confused about where they were at times. We
looked at the penalties incurred by the teams and found instances where the operators were unaware of the
surroundings of the robots or the status of the robots. In particular, few robots in this competition
had a view of what was behind them. In situations where the operator was forced to back up or to make a
series of tight turns, this resulted in penalties for bumping into walls or victims.

In  our  analysis  of  a  second  set  of  data  collected  at  Robocup  2003,  we  looked  at  issues  of  awareness  [9]. 
Burke  [1]  identified  situation  awareness  (SA)  as  a  major  component  needed  for  effective  human-robot 
performance. Scholtz [10] has modified Endsley's SAGAT [4] methodology for measuring the SA
provided by supervisory interfaces for semi-autonomous driving vehicles. Scholtz also analyzed the time
needed for operator acquisition of SA in two types of terrain [12, 13]. In this analysis we used a
definition of awareness tailored to HRI [3].


If we consider teams consisting of humans and robots, we can define five types of awareness:

•  Human-robot awareness

•  Robot-human awareness

•  Human-human awareness

•  Robot-robot awareness

•  Humans' overall mission awareness

In  the  majority  of  teams  competing  in  the  USAR  Test  Arenas,  we  are  able  to  evaluate  only  human-robot 
awareness.  That  is,  does  the  human  have  knowledge  of  the  location,  status,  and  behavior  of  the  robot?  We 
find  few  teams  that  have  multiple  robots  with  any  collaboration  capabilities  (robot-robot  awareness)  or  use 
multiple  operators  (human-human  awareness).  Moreover,  the  current  generation  of  robots  in  these 
competitions  has  no  awareness  of  the  operators’  status  (robot-human  awareness). 

We used an indirect means of assessing human-robot awareness as we are not able to intervene and ask the
operator to verbally describe any given situation. We coded critical incidents observed in the video tapes
of the robot moving in the arena. A critical incident is defined as a situation in which the robot was in a
position that could potentially be harmful to the robot, the environment, a victim, or the mission.

Originally, we had intended to code critical incidents that were "avoided", such as when the robot was able
to move through an extremely tight space without causing any damage. However, we found that we were
unable to do this consistently. We were able to consistently locate and code critical incidents that had a
negative outcome, e.g., the robot bumped into a wall. We classified the critical awareness incidents into one
of five categories: global navigation, local navigation, victim identification, obstacle encounter, and vehicle
state. Obstacle encounter was coded when the robot had actually run into an obstacle and had to perform
maneuvers to free itself. Vehicle state awareness was coded when the operator did not realize that the robot
was in other than a normal state, e.g., tipped over. In the runs we coded, we found evidence of critical
incidents only in the categories of local navigation, obstacle encounter, and vehicle state. We did see
evidence of the other types of critical incidents but these were not in the actual runs selected for coding (the
semifinals and finals). Table 3 shows the numbers of critical incidents occurring for the three teams
analyzed. These three teams were selected for analysis as they placed in the final round of the competition.


Table 3: Analysis of Critical Incidents by Team

            Local navigation    Obstacle encounter    Vehicle state

Team A      4                   6                     5
Team B      1                   9                     2
Team C      10                  11                    5

Total       15                  26                    12

Obstacle encounters were the most prevalent type of critical incident. Robots became entangled in loose
debris in the arenas and it was difficult for the operators to detect this.
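The totals in Table 3 and the observation about obstacle encounters can be checked with a short tally. This is a minimal sketch that uses the counts from Table 3. The data layout and function names are assumptions made for illustration, not the coding scheme used in the study.

```python
from collections import Counter

# Critical incident counts transcribed from Table 3.
incidents = {
    "Team A": {"local navigation": 4, "obstacle encounter": 6, "vehicle state": 5},
    "Team B": {"local navigation": 1, "obstacle encounter": 9, "vehicle state": 2},
    "Team C": {"local navigation": 10, "obstacle encounter": 11, "vehicle state": 5},
}


def totals_by_category(table):
    """Sum the incident counts over all teams for each category."""
    totals = Counter()
    for counts in table.values():
        totals.update(counts)  # Counter.update adds counts rather than replacing them
    return dict(totals)


def most_prevalent(table):
    """Return the category with the highest total count."""
    totals = totals_by_category(table)
    return max(totals, key=totals.get)


print(totals_by_category(incidents))
print(most_prevalent(incidents))
```

The computed totals (15, 26, and 12) agree with the Total row of Table 3, and the category with the highest count is obstacle encounter.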

In  the  most  recent  competition,  Robocup  2004,  we  noted  that  teams  typically  had  one  of  two  sources  of 
situation  awareness  information  implemented.  A  number  of  teams  used  some  sort  of  overhead  cameras  to 
provide  a  frame  of  reference  for  the  robot  in  relationship  to  the  environment.  Other  teams  had 
implemented  some  sort  of  automatic  mapping  software,  using  a  variety  of  sensors,  including  sonar  and 
ladar. At this point we have not had time to do a full analysis, but an early analysis looks at the five teams
who were in the final runs. Table 4 shows the penalties by team.


Table  4:  Penalties  by  type  of  situation  awareness 


Penalties  for  teams  using  automatic  mapping 

Penalties  for  teams  using  overhead  cameras 

Team  A 

0 

Team  P 

80 

Team  D 

5 

Team  S2 

5 

Team S1

40 

(note  Team  S2  had  only  3  runs  completed  as  they 
had  to  end  one  run  prematurely  due  to  a  problem 
with  the  robot) 

These penalties are all local navigation penalties. That is, the robot either bumped into the walls of the
arena or into a victim. While these results should be viewed as very preliminary, our impression is that the
automatic mapping is more helpful in providing situation awareness. This is not surprising, as the video
information, while helpful, still requires considerable interpretation by the operator. Also, if there happens
to be any sort of communication interference, the video is extremely difficult to view. The teams that we
analyzed were the top scoring teams, which implies that they had reasonable coverage of the area and
located a number of victims. Low scoring teams may have few penalties due primarily to an inability to
move very far into the USAR arena.

The majority of teams we have analyzed have been teleoperated, using autonomy only for such things as
mapping. While we have had some fully autonomous teams in the competitions, they have not been
successful in navigating the difficult environment in the test arena. In our first analysis, two of the teams
operated in semi-autonomous modes. The operators were responsible for overall navigation, but left the
local navigation (obstacle avoidance, waypoint navigation) to the robots in many instances. We intend to
analyze future teams to determine how critical incidents change based on the level of autonomy.

Discussion 

Based on the analyses we have completed to date, we have been able to provide some guidelines for
human-robot interaction design. These are summarized below.

•  Information for effective situation awareness should include:

o  a frame of reference to determine the position of the robot relative to the surrounding
environment

o  indicators of vehicle state, such as pitch, roll, traction indicators, indicators of sensor
status, and camera positions relative to the robot body

o  a map to provide global navigation information

•  Minimize  the  number  of  windows  provided  to  the  operator. 

•  Provide  a  fused  view  of  sensor  information. 

•  Support  multiple  robot  operators  in  a  single  display. 

•  Provide  help  from  the  robot  in  determining  what  mode  of  autonomy  is  most  useful. 

To date, we have been able to analyze data collected during USAR competitions to provide some
guidelines for the design of effective user interfaces for USAR robots. We are encouraged that our work is
making a difference as the situation awareness offered in the user interfaces deployed in current
competitions is certainly increasing. The downside of our work is that the analysis takes considerable time
and by and large the results are consumed by human-computer interaction researchers, not robotics
researchers.


Future Plans

We are interested in providing feedback about HRI designs in a more timely fashion and to the robotics
community more directly. In the final rounds of Robocup 2004, robots were placed in an internal spot in
one of the arenas. The operator had to first assess where the robot was and then devise a strategy for
moving out into the arena to locate victims. We are currently working on devising extensions to this,
similar to the compulsories in figure skating competitions. This would help us assess during the
competition how well the operator is able to gain situation awareness based on the user interface. While
the NIST Reference Arenas provide a standard area in which the robots have to perform, there is no
guarantee that robots encounter the same obstacles. Moreover, due to variations in size and mobility, we
cannot expect robots to do equally well in navigating the same environment.

It is important for effective search and rescue that teams are in control. This means having good SA at all
times about where team members, including robots, are and what they are doing. Situation awareness could
be demonstrated by placing robots in specific situations (such as close to obstacles, on different types of
surfaces and grades, or near negative obstacles) and measuring the time and accuracy of the SA by the
operator-robot team. We are working on a user interface design for our own robotics platform. It would
be possible to consider our performance as a baseline that the teams should try to best.